Run MCMC on binomial model
Gaussian distribution, one sample
Hierarchical model, two groups of Gaussian observations
Model description contains 2 parts:
\[y_i \sim \text{Binom} (p = \theta, \ size = 1) \\ \theta \sim \text{Beta}(A_{prior},B_{prior})\]
where \(\theta\) is the unknown probability of success with prior beta distribution with fixed parameters \(A,\ B\).
To check the results of simulation use analytical formulas for posterior distribution: if prior beta distribution has parameters \(A_{prior},B_{prior}\) and the data contain number of successes \(s\),
\[s = \sum_{i=1}^k y_i\]
out of \(k\) observations then posterior beta distribution has parameters
\[A_{post} = A_{prior} + s; \ B_{post} = B_{prior} + k − s\]
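As a quick sanity check of the conjugate update, here is a minimal sketch; it assumes the prior Beta(10, 10) and the data used below, 13 successes out of 41 trials:

# Conjugate beta-binomial update (sketch)
A_prior <- 10; B_prior <- 10   # prior parameters
s <- 13; k <- 41               # number of successes and number of observations
A_post <- A_prior + s          # 23
B_post <- B_prior + k - s      # 38
c(A_post = A_post, B_post = B_post, post_mean = A_post/(A_post + B_post))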
Create a vector of \(0\)s and \(1\)s representing the trials; here the data are summarized directly as \(13\) successes (heads) out of \(k = 41\) trials (flips), and the prior parameters are set to \(a = b = 10\). The simulation call that would generate such a vector is left commented out.
# Y <- rbinom(41, size=1, p=.32)
flips = 41
heads = 13
a <- 10; b <- 10
Recall that the Metropolis-Hastings MCMC algorithm consists of the following steps:
- Generate a new proposal theta_prop using some convenient distribution.
- Decide whether the proposal theta_prop is accepted or rejected. To do that follow the logic:
  - if the posterior probability density at theta_prop is greater than the posterior probability density at theta_curr, then accept theta_prop into the chain sample;
  - if the posterior probability density at theta_prop is smaller than the posterior probability density at theta_curr, then accept theta_prop with probability
\[p_{dec} = \frac{p(\theta_{prop} \mid y)}{p(\theta_{curr} \mid y)} = \frac{p(y \mid \theta_{prop}) \ p(\theta_{prop} \mid A, B)}{p(y \mid \theta_{curr}) \ p(\theta_{curr} \mid A, B)} \]
and reject theta_prop with probability $1-p_{dec}$.

So a proposal that increases the posterior density is always accepted, while a proposal that decreases it is accepted only with probability \(p_{dec}\); otherwise theta_prop
is accepted.
metropolis_algorithm <- function(samples, theta_seed, sd){
  theta_curr <- theta_seed
  # Create vector of NAs to store sampled parameters
  posterior_thetas <- rep(NA, times = samples)
  for (i in 1:samples){
    # Typically new proposals are generated from Gaussian distribution centered at the current value
    # of Markov chain with sufficiently small variance.
    theta_prop <- rnorm(n = 1, mean = theta_curr, sd = sd) # Proposal distribution
    # If the proposed parameter is outside its range then set it equal to its current
    # value. Otherwise keep the proposed value
    theta_prop <- ifelse((theta_prop < 0 | theta_prop > 1), theta_curr, theta_prop)
    # Bayes' numerators
    posterior_prop <- dbeta(theta_prop, shape1 = a, shape2 = b) * dbinom(heads, size = flips, prob = theta_prop)
    posterior_curr <- dbeta(theta_curr, shape1 = a, shape2 = b) * dbinom(heads, size = flips, prob = theta_curr)
    # Calculate probability of accepting
    p_accept_theta_prop <- min(posterior_prop/posterior_curr, 1.0)
    rand_unif <- runif(n = 1)
    # Probabilistically accept proposed theta
    theta_select <- ifelse(p_accept_theta_prop > rand_unif, theta_prop, theta_curr)
    posterior_thetas[i] <- theta_select
    # Reset theta_curr for the next iteration of the loop
    theta_curr <- theta_select
  }
  return(posterior_thetas)
}
set.seed(12423)
posterior_thetas <- metropolis_algorithm(samples = 10000, theta_seed = 0.9, sd = 0.05)
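A quick way to gauge how often proposals were accepted is to look at how often the chain actually moved (a rough sketch; repeated consecutive values correspond to rejections):

# Approximate acceptance rate: fraction of iterations where the chain moved
mean(diff(posterior_thetas) != 0)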
After simulating the Markov chain, check the results.
Compare the prior parameters with the posterior parameters given by the formulas above.
Since the beta distribution is conjugate to the binomial distribution (above), we can see how close our sampled posterior distribution approximates the exact posterior for \(\theta\). Recall that with a \(Beta(10, 10)\) prior and with \(13\) heads in \(41\) flips, the posterior distribution is also beta distributed with \(a = 10+13 = 23\) and \(b = 10+(41-13) = 38\). So, let's overlay the exact posterior distribution on our previous graph also.
opar <- par()
par(mar=c(2.5,3.5,3,2.1), mgp = c(1.7, 1, 0))
d <- density(posterior_thetas)
plot(d,
     main = expression(paste('Kernel Density Plot for ', theta)),
     xlab = expression(theta),
     ylab = 'density',
     yaxt = 'n',
     cex.lab = 1.3
)
polygon(d, col='dodgerblue1', border='dark blue')
exactb <- rbeta(n = 10000, shape1 = 23, shape2 = 38)
lines(density(exactb), # Plot of randomly drawn beta density
      type = 's', col = 'red')
hdi(posterior_thetas, credMass = 0.95)
lower upper
0.2563125 0.4986970
attr(,"credMass")
[1] 0.95
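To make the comparison with the analytical posterior more explicit, the moments of the sampled posterior can be compared with the exact Beta(23, 38) moments (a sketch using posterior_thetas from the run above):

# Sampled vs exact posterior moments for theta (sketch)
A_post <- 23; B_post <- 38
exact_mean <- A_post/(A_post + B_post)
exact_sd   <- sqrt(A_post*B_post/((A_post + B_post)^2*(A_post + B_post + 1)))
rbind(Sampled = c(Mean = mean(posterior_thetas), Sd = sd(posterior_thetas)),
      Exact   = c(Mean = exact_mean, Sd = exact_sd))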
Simulate a single sample from a Gaussian distribution with unknown mean \(\theta\) and known standard deviation \(\sigma\).
mu <- 120
si <- 25
nSample <- 200
set.seed(05152022)
Y <- rnorm(nSample, mean = mu, sd = si)
Theoretical.data <- c(Mean = mu, Sd = si)
plot(Y)
Estimate mean of the distribution, check histogram, test hypothesis \(H_0\): \(\mu = 120\) against two-sided alternative \(H_a\): \(\mu \neq 120\).
meanMLE <- mean(Y)
hist(Y, freq = F)
plot(density(Y))
(zStat <- (meanMLE - Theoretical.data["Mean"])/(Theoretical.data["Sd"]/sqrt(nSample)))
Mean
0.3157925
(sSidedP <- pnorm(zStat,lower.tail = F) + pnorm(-zStat,lower.tail = T))
Mean
0.75216
Finally, the distribution of the data estimated by MLE:
(MLE.data <- c(Mean = meanMLE, Sd = Theoretical.data["Sd"]))
Mean Sd.Sd
120.5582 25.0000
Set the model as \[y_i \sim N(\theta, \sigma)\]
where \(\theta\) is unknown and
\(\sigma = 25\).
Set prior distribution for the mean value as low informative Gaussian
with parameters \(M = 100\), \(\sigma_{\mu} = 200\), i.e.
\[ \theta \sim N(M, \sigma_{\mu})\]
M <- 100
priorParamWeak <- c(Mean = M, Sd = 200)
Obtain posterior distribution using formulas for conjugate Gaussian distribution:
\[ \mu_{post} = \frac{\mu_{prior} \gamma_{prior} + \bar{y} \gamma_{data}}{\gamma_{prior} + \gamma_{data}} = \frac{M \gamma_{prior} + \bar{y} \gamma_{data}}{\gamma_{prior} + \gamma_{data}} = M \frac{\gamma_{prior}}{\gamma_{prior} + \gamma_{data}} + \bar{y} \frac{\gamma_{data}}{\gamma_{prior} + \gamma_{data}} \]
\[ \sigma_{post}^2 = \frac{1}{\gamma_{prior} + \gamma_{data}} \]
where \(n\) is the sample size and \(\gamma_{data}\) and \(\gamma_{prior}\) are data precision and prior precision, correspondingly:
\[ \gamma_{data} = \frac{n}{\sigma^2}; \qquad \gamma_{prior} = \frac{1}{\sigma_{\mu}^2} \]
precision_prior <- 1/200^2
precision_data <- nSample/25^2
pSigmasq <- 1/(precision_prior + precision_data)
pSd <- sqrt(pSigmasq)
pMean <- 100*precision_prior*pSigmasq + meanMLE*precision_data*pSigmasq
(posterParamWeak <- c(Mean = pMean, Sd = pSd))
Mean Sd
120.556641 1.767698
The Bayesian estimate is \(\mu_{post}\) = 120.5566415, showing almost no compromise between the likelihood estimate 120.5582475 and the prior mean 100: with such a weak prior the posterior mean practically coincides with the MLE.
Plot theoretical (simulated), estimated by MLE and estimated by Bayesian analysis densities.
Bayes.dataWeak <- c(posterParamWeak["Mean"], Theoretical.data["Sd"])
X <- seq(from = 115, to = 125, by = 1)
plot(X, dnorm(X, Theoretical.data[1], Theoretical.data[2]),
     type="l", xlab="X", ylab="Density")
lines(X, dnorm(X,meanMLE,si), col="orange", lwd=3)
lines(X, dnorm(X,posterParamWeak[1],si), col="blue")
legend("topright",
       legend = c("Theoretical", "MLE", "Bayesian"),
       col = c("black", "orange", "blue"),
       lty = 1)
For the sample of length \(200\) and a low-informative prior the Bayesian estimate is very close to the MLE rather than to the prior. In addition, the standard deviation of the posterior distribution collapsed from \(200\) to 1.7676979.
rbind(Theoretical.data, MLE.data, Bayes.dataWeak)
Mean Sd
Theoretical.data 120.0000 25
MLE.data 120.5582 25
Bayes.dataWeak 120.5566 25
rbind(priorParamWeak, posterParamWeak)
Mean Sd
priorParamWeak 100.0000 200.000000
posterParamWeak 120.5566 1.767698
Repeat Bayesian analysis with stronger prior \(M = 100, \ \sigma_{\mu} = 2\).
priorParamStrong <- c(Mean = M, Sd = 2)
precision_prior <- 1/2^2
precision_data <- nSample/25^2
pSigmasq <- 1/(precision_prior + precision_data)
pSd <- sqrt(pSigmasq)
pMean <- 100*precision_prior*pSigmasq + meanMLE*precision_data*pSigmasq
posterParamStrong <- c(Mean = pMean, Sd = pSd)
rbind(priorParamStrong, posterParamStrong)
Mean Sd
priorParamStrong 100.0000 2.000000
posterParamStrong 111.5415 1.324532
The Bayesian estimate of the mean shifted towards the mode of the prior distribution: now the mean of the posterior distribution is 111.5414723.
Such a shift of the lower-level parameter towards the mode of the higher-level parameter is called shrinkage.
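The shrinkage can be seen directly from the precision-weighted-average form of the posterior mean (a short sketch; it reuses precision_prior, precision_data and meanMLE computed for the strong prior above):

# Posterior mean as a precision-weighted average of the prior mean and the MLE (sketch)
w_prior <- precision_prior/(precision_prior + precision_data)
w_data  <- precision_data /(precision_prior + precision_data)
c(weight_prior = w_prior, weight_data = w_data,
  posterior_mean = 100*w_prior + meanMLE*w_data)  # approximately 111.54 for the strong prior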
Prepare the data list dataList in the format required by JAGS.
dataList <- list("Y" = Y,            # data vector
                 "Ntotal" = nSample) # length of the sample
The model diagram:
Interpretation of the diagram starts from the bottom. Each arrow of the diagram corresponds to a line of description code for the model.
Line 1 of the model describes generation of the data according to the likelihood function: \(y_i \sim N(\mu, \sigma)\).
Line 2 of the model describes generation of the parameter \(\mu\) from the prior distribution: \(\mu \sim N(M, \sigma_{\mu})\).
In this case parameters of the prior distribution should be defined. For example, like above set \(M = 100, \sigma_{\mu} = 200\).
The format of the model description for JAGS is a string of the form:
model1NormalSample = "
model {
  # Description of data
  for (i in 1:Ntotal){
    Y[i] ~ dnorm(mu, 1/sigma^2) #tau = 1/sigma^2
  }
  # Description of prior
  mu ~ dnorm(100 , Ntotal/200^2)
  #tau <- pow(sigma, -2) #JAGS uses precision
  sigma <- 25
}
"
Data are described as a for loop. Prior is described as a typical formula: "lower order parameter $\sim$ density(higher order parameters)".
Note. In JAGS the normal distribution is specified with mean value and precision, i.e. instead of $dnorm(\mu, \sigma)$ use $dnorm(\mu, \frac{1}{\sigma^2})$.
Note that the variable names for the data and data length, Y and Ntotal, in the model description have to match the names in the data list.
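For instance, a small sketch of the two parameterizations for \(\sigma = 25\) (the JAGS line is shown as a comment because it is only meaningful inside a model string):

# R parameterization of the normal density: mean and standard deviation
dnorm(130, mean = 120, sd = 25)
# The equivalent JAGS node is declared with precision tau = 1/sigma^2:
#   y ~ dnorm(120, 1/25^2)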
The next step of preparation of the model description for JAGS is saving it to a temporary text file Tempmodel.txt.
writeLines(model1NormalSample, con="Tempmodel.txt")
To initialize trajectories define a list of lists with several values
or create a function that returns an init value every time it is
called.
JAGS will call this function when it starts a new chain.
General syntax of initiation function is:
initsList <- function() {
  # Definition of mu and sigma init
  upDown <- sample(c(1,-1),1)
  m <- mean(Y)*(1+upDown*.05)
  list(mu = m)
}
For example, select starting point from normal distribution centered at MLE with some reasonable standard deviation.
Check how initiation works by running the function several times.
initsList()
$mu
[1] 126.5862
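Alternatively, the initial values could be supplied explicitly as a list of lists, one element per chain (a sketch with hypothetical starting values):

# Explicit initial values, one list per chain (hypothetical values)
initsListExplicit <- list(list(mu = 115),
                          list(mu = 120),
                          list(mu = 125))
# Could be passed to jags.model() as inits = initsListExplicit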
The next step is getting all information to JAGS using jags.model() from library rjags.
This function transfers the data, the model specification and the
initial values to JAGS and requests JAGS to select appropriate sampling
method.
jagsModel <- jags.model(file = "Tempmodel.txt", data = dataList, n.chains = 3, n.adapt = 500)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 200
Unobserved stochastic nodes: 1
Total graph size: 211
Initializing model
In jags.model parameter n.chains specifies the number of chains to run (defaults to 1) and parameter n.adapt sets the number of steps JAGS can take to tune the sampling method (defaults to 1000).
The object returned by jags.model contains all information that we need to communicate to JAGS about the problem in the format suitable for JAGS.
Now run JAGS chains for 600 steps to complete burn-in, i.e. the transition to the stationary distribution of the Markov chain.
update(jagsModel,n.iter=600)
After completing burn in generate MCMC trajectories representing the posterior distribution for the model.
codaSamples <- coda.samples(jagsModel, variable.names=c("mu"), n.iter = 3334)
list.samplers(jagsModel)
$`bugs::ConjugateNormal`
[1] "mu"
head(codaSamples)
[[1]]
Markov Chain Monte Carlo (MCMC) output:
Start = 601
End = 607
Thinning interval = 1
mu
[1,] 122.0174
[2,] 120.6580
[3,] 121.8346
[4,] 121.1977
[5,] 122.0095
[6,] 123.1905
[7,] 121.9572
[[2]]
Markov Chain Monte Carlo (MCMC) output:
Start = 601
End = 607
Thinning interval = 1
mu
[1,] 118.4939
[2,] 118.1665
[3,] 119.8600
[4,] 123.0622
[5,] 119.9395
[6,] 121.2756
[7,] 120.5576
[[3]]
Markov Chain Monte Carlo (MCMC) output:
Start = 601
End = 607
Thinning interval = 1
mu
[1,] 118.0154
[2,] 121.2776
[3,] 119.6511
[4,] 119.7407
[5,] 119.4734
[6,] 121.0419
[7,] 121.6894
attr(,"class")
[1] "mcmc.list"
Besides the model specification object, coda.samples() takes a vector of character strings corresponding to the names of parameters to record, variable.names, and the number of iterations to run, n.iter.
In this example there are 3 chains, so the total number of iterations will be about 10,000.
Use function list.samplers to show the samplers applied to the model.
Analyze convergence of the chains using the following tools:
summary(codaSamples)
Iterations = 601:3934
Thinning interval = 1
Number of chains = 3
Sample size per chain = 3334
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
120.21874 1.74212 0.01742 0.01764
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
116.9 119.0 120.2 121.4 123.6
coda::traceplot(codaSamples)
densplot(codaSamples)
plot(codaSamples)
autocorr.plot(codaSamples,ask=F)
effectiveSize(codaSamples)
mu
9761.497
The ESS number is consistent with the autocorrelation function.
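The same information can be inspected numerically (a sketch; autocorr.diag() is part of the coda package):

# Lag-wise autocorrelations of the chains; values near zero at small lags
# agree with an effective sample size close to the nominal number of draws
autocorr.diag(codaSamples)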
gelman.diag(codaSamples)
Potential scale reduction factors:
Point est. Upper C.I.
mu 1 1.01
gelman.plot(codaSamples)
chainMeans <- lapply(codaSamples,function(z) mean(z))
chainSd <- lapply(codaSamples,function(z) sd(z))
rbind(Means = chainMeans, SD = chainSd)
[,1] [,2] [,3]
Means 120.1994 120.2765 120.1803
SD 1.752507 1.755037 1.717595
Compare the posterior densities generated by 3 chains with analytical posterior distribution:
(l <- min(unlist(codaSamples))-.05)
[1] 112.8358
(h <- max(unlist(codaSamples))+.05)
[1] 127.7117
histBreaks <- seq(l,h,by=.05)
postHist <- lapply(codaSamples,hist,breaks=histBreaks)
plot(postHist[[1]]$mids,postHist[[1]]$density,type="l",
col="black",lwd=2,ylab="Distribution Density",xlab="Theta")
lines(postHist[[2]]$mids,postHist[[2]]$density,type="l",col="red",lwd=2)
lines(postHist[[3]]$mids,postHist[[3]]$density,type="l",col="blue",lwd=2)
lines(postHist[[3]]$mids,
dnorm(postHist[[3]]$mids,posterParamWeak["Mean"],posterParamWeak["Sd"]),
type="l",col="green",lwd=3)
legend("topright",legend=c("Chain1","Chain2","Chain3","Theoretical"),col=c("black","red","blue","green"),lwd=2)
Consider now two groups of Gaussian observations with unknown means \(\mu_1, \mu_2\) and same known standard deviation \(\sigma = 25\).
Create data, prepare them for JAGS.
Keep the first group sample as in the previous section: theoretical mean
120 and theoretical known standard deviation 25.
Simulate second sample with the same standard deviation and theoretical
mean 150.
Combine both samples together and add second column containing group
number.
set.seed(05162022)
Y2 <- rnorm(nSample, mean = 150, sd = 25)
Y2 <- rbind(cbind(Y, rep(1, nSample)), cbind(Y2, rep(2, nSample)))
colnames(Y2) <- c("y","s")
den1 <- density(subset(Y2,Y2[,2]==1)[,1])
den2 <- density(subset(Y2,Y2[,2]==2)[,1])
plot(den1$x, den1$y, type="l", ylim=c(0,.02))
lines(den2$x, den2$y, col="blue")
The sample from both groups now is Y2.
Create data for JAGS.
<- Y2[,"y"]
y <- Y2[,"s"]
s <- length(y)
nSample <- length(unique(s))
nGr <- list(y = y, s = s, nSample = nSample, nGr = nGr)
dataList names(dataList)
[1] "y" "s" "nSample" "nGr"
In this section consider a model structure with a common weak prior for the mean values of both groups.
Think about situations in which a common prior is a reasonable assumption.
The model diagram convenient for coding it in JAGS or Stan looks like this:
An equivalent diagram may help in understanding the difference between the hierarchical model of random effects and the non-hierarchical model of fixed effects.
Select hyper-parameters of the common prior normal distribution as in the previous section: \(M=100, \ \sigma_{\mu}=200\).
Prepare the model string.
The difference from the one-group data description is that now the prior distribution for \(\mu\) is described in a loop over all groups.
= "
model2NormalSample model {
# Description of data
for (i in 1:nSample){
y[i] ~ dnorm(mu[s[i]], 1/sigma^2) #tau = 1/sigma^2
}
# Description of prior
#mu[1:nGr] <- c(100,100)
for (j in 1:nGr) {
mu[j] ~ dnorm(M, nSample/sigma_mu^2)
}
sigma <- 25
M <- 100
sigma_mu <- 200
}
"
writeLines(model2NormalSample, con="Tempmodel2.txt")
Initialize chains randomly around MLE.
initsList <- function() {
  # Definition of mu and sigma init
  theta <- vector()
  for (i in 1:nGr){
    upDown <- sample(c(1,-1),1)
    theta[i] <- mean(Y2[which(s==i)])*(1+upDown*.05)
  }
  list(theta = theta)
}
# Check initiation
initsList()
$theta
[1] 126.5862 159.6764
Send the model to JAGS. Set the main parameters of jags.model() to transfer the information to JAGS. Arguments for this function are: the model file, the data list, the initiation function name, the number of chains and the number of steps for adaptation.
jagsModel2 <- jags.model(file = "Tempmodel2.txt", data = dataList, n.chains = 3, n.adapt = 500)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 400
Unobserved stochastic nodes: 2
Total graph size: 813
Initializing model
Run burn-in by using update().
update(jagsModel2, n.iter = 600)
Make the main run by using coda.samples().
parameters <- c("mu")
nIter <- 10000
codaSamples2Groups1Prior = coda.samples(jagsModel2,
                                        variable.names = parameters,
                                        n.iter = nIter)
head(codaSamples2Groups1Prior)
[[1]]
Markov Chain Monte Carlo (MCMC) output:
Start = 601
End = 607
Thinning interval = 1
mu[1] mu[2]
[1,] 118.6467 149.0570
[2,] 122.5462 151.6248
[3,] 118.8301 145.8869
[4,] 120.6572 150.9358
[5,] 121.1533 146.9245
[6,] 119.5723 151.1555
[7,] 121.0751 151.1671
[[2]]
Markov Chain Monte Carlo (MCMC) output:
Start = 601
End = 607
Thinning interval = 1
mu[1] mu[2]
[1,] 122.9129 151.3536
[2,] 120.3069 147.1769
[3,] 115.7560 151.4156
[4,] 120.6509 149.5604
[5,] 117.8543 151.9259
[6,] 118.6829 154.2494
[7,] 119.4651 151.7435
[[3]]
Markov Chain Monte Carlo (MCMC) output:
Start = 601
End = 607
Thinning interval = 1
mu[1] mu[2]
[1,] 118.0450 149.2746
[2,] 119.8209 149.0857
[3,] 125.1278 151.2497
[4,] 116.9475 150.0177
[5,] 117.3252 150.1579
[6,] 118.6826 148.6312
[7,] 120.5690 152.0444
attr(,"class")
[1] "mcmc.list"
list.samplers(jagsModel2)
$`bugs::ConjugateNormal`
[1] "mu[2]"
$`bugs::ConjugateNormal`
[1] "mu[1]"
Analyze convergence.
summary(codaSamples2Groups1Prior)
Iterations = 601:10600
Thinning interval = 1
Number of chains = 3
Sample size per chain = 10000
1. Empirical mean and standard deviation for each variable,
plus standard error of the mean:
Mean SD Naive SE Time-series SE
mu[1] 120.0 1.735 0.01002 0.01002
mu[2] 150.5 1.735 0.01001 0.01001
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
mu[1] 116.5 118.8 120.0 121.1 123.4
mu[2] 147.1 149.3 150.5 151.7 153.9
plot(codaSamples2Groups1Prior)
autocorr.plot(codaSamples2Groups1Prior, ask = F)
effectiveSize(codaSamples2Groups1Prior)
mu[1] mu[2]
29965.46 30000.00
gelman.diag(codaSamples2Groups1Prior)
Potential scale reduction factors:
Point est. Upper C.I.
mu[1] 1 1
mu[2] 1 1
Multivariate psrf
1
gelman.plot(codaSamples2Groups1Prior)
Check estimated means.
matrix(unlist(lapply(codaSamples2Groups1Prior,function(z) apply(z,2,mean))),ncol=3)
[,1] [,2] [,3]
[1,] 119.9629 119.9299 119.9627
[2,] 150.5279 150.4870 150.5058
Plot posterior densities.
plot(density(codaSamples2Groups1Prior[[1]][,1]),xlim=c(110,160),ylim=c(0,.25))
lines(density(codaSamples2Groups1Prior[[1]][,2]))
lines(density(codaSamples2Groups1Prior[[2]][,1]),col="orange")
lines(density(codaSamples2Groups1Prior[[2]][,2]),col="orange")
lines(density(codaSamples2Groups1Prior[[3]][,1]),col="blue")
lines(density(codaSamples2Groups1Prior[[3]][,2]),col="blue")
Calculate HDIs for each chain.
lapply(codaSamples2Groups1Prior,function(z) hdi(as.matrix(z)))
[[1]]
mu[1] mu[2]
lower 116.6673 147.1416
upper 123.3728 153.8997
attr(,"credMass")
[1] 0.95
[[2]]
mu[1] mu[2]
lower 116.6291 146.9730
upper 123.4452 153.7687
attr(,"credMass")
[1] 0.95
[[3]]
mu[1] mu[2]
lower 116.7187 147.1051
upper 123.5352 153.9668
attr(,"credMass")
[1] 0.95
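The three chains can also be pooled before computing a single HDI per parameter (a sketch):

# Pool the chains into one matrix and compute one 95% HDI for each mu
pooled <- do.call(rbind, lapply(codaSamples2Groups1Prior, as.matrix))
hdi(pooled, credMass = 0.95)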
Find differences between \(\mu_1\) and \(\mu_2\).
chainDiffs <- lapply(codaSamples2Groups1Prior, function(z) z[,2]-z[,1])
lapply(chainDiffs, function(z) hdi(as.matrix(z)))
[[1]]
var1
lower 25.68099
upper 35.32636
attr(,"credMass")
[1] 0.95
[[2]]
var1
lower 25.63348
upper 35.25554
attr(,"credMass")
[1] 0.95
[[3]]
var1
lower 25.55780
upper 35.19627
attr(,"credMass")
[1] 0.95
Find left 95% HDI boundaries for the chain differences and plot them.
(leftBounds <- unlist(lapply(chainDiffs,function(z) hdi(as.matrix(z))[1])))
[1] 25.68099 25.63348 25.55780
plot(density(chainDiffs[[1]]),xlim=c(15,35),ylim=c(0,.2),col="black")
lines(density(chainDiffs[[2]]),col="red")
lines(density(chainDiffs[[3]]),col="blue")
abline(v=leftBounds,col=c("black","orange","blue"))
For the given sample and prior distribution mean values of the 2 groups are clearly different based on 95% HDI for all 3 chains.
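Equivalently, the posterior probability of a positive difference can be estimated directly from the chain differences (a sketch; values close to 1 confirm the HDI-based conclusion):

# Share of posterior draws with mu[2] > mu[1]
mean(unlist(chainDiffs) > 0)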
Repeat the calculations with different prior hyperparameters.
To do that organize all steps of running MCMC with JAGS in a function runMCMC2Groups <- function(prMean, prSD, dat), where prMean and prSD are the prior hyperparameters and dat is the data list.
Note that in order to pass arguments of the function to the JAGS model description you need to append them to the data list, hyper <- list(hypeMean = prMean, hypeSD = prSD), and pass the combined list to jags.model(): jags.model("TEMPmodel.txt", data=append(dat,hyper),...). Inside the model string the prior is then written as dnorm(hypeMean, 1/hypeSD^2).
runMCMC2Groups <- function(prMean, prSD, dat){
  model2NormalSamplesb = "
  model {
    for (i in 1:nSample){
      y[i] ~ dnorm(theta[s[i]], 1/25^2)
    }
    for (sIdx in 1:nGr) {  # Different thetas from same prior
      theta[sIdx] ~ dnorm(hypeMean,1/hypeSD^2)
    }
  }
  " # close quote for modelString
  writeLines(model2NormalSamplesb, con="data/TEMPmodel3.txt")
  parameters = c("theta")  # The parameters to be monitored
  adaptSteps = 500         # Number of steps to adapt the samplers
  burnInSteps = 500        # Number of steps to burn-in the chains
  nChains = 3              # nChains should be 2 or more for diagnostics
  numSavedSteps <- 50000
  nIter = ceiling(numSavedSteps/nChains)
  hyper <- list(hypeMean = prMean, hypeSD = prSD)
  # Create, initialize, and adapt the model:
  jagsModel = jags.model("data/TEMPmodel3.txt",
                         data = append(dat,hyper),
                         inits = initsList,
                         n.chains = nChains,
                         n.adapt = adaptSteps)
  update(jagsModel , n.iter=burnInSteps)
  codaSamplesResult <- coda.samples(jagsModel,
                                    variable.names = parameters,
                                    n.iter = nIter)
  codaSamplesResult
}
Run.130.200 <- runMCMC2Groups(130, 200, dataList)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 400
Unobserved stochastic nodes: 2
Total graph size: 813
Initializing model
lapply(Run.130.200,function(z) apply(z, 2, mean))
[[1]]
theta[1] theta[2]
120.5531 152.0436
[[2]]
theta[1] theta[2]
120.5736 152.0886
[[3]]
theta[1] theta[2]
120.5378 152.0711
chainDiffs <- lapply(Run.130.200,function(z) z[,2]-z[,1])
lapply(chainDiffs,function(z) hdi(as.matrix(z)))
[[1]]
var1
lower 26.66148
upper 36.45065
attr(,"credMass")
[1] 0.95
[[2]]
var1
lower 26.49189
upper 36.27758
attr(,"credMass")
[1] 0.95
[[3]]
var1
lower 26.75067
upper 36.52194
attr(,"credMass")
[1] 0.95
Run.130.2 <- runMCMC2Groups(130, .2, dataList)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 400
Unobserved stochastic nodes: 2
Total graph size: 813
Initializing model
lapply(Run.130.2,function(z) apply(z,2,mean))
[[1]]
theta[1] theta[2]
129.8797 130.2782
[[2]]
theta[1] theta[2]
129.8807 130.2812
[[3]]
theta[1] theta[2]
129.8832 130.2785
chainDiffs <- lapply(Run.130.2, function(z) z[,2]-z[,1])
lapply(chainDiffs,function(z) hdi(as.matrix(z)))
[[1]]
var1
lower -0.1498324
upper 0.9505163
attr(,"credMass")
[1] 0.95
[[2]]
var1
lower -0.1391972
upper 0.9614988
attr(,"credMass")
[1] 0.95
[[3]]
var1
lower -0.1693627
upper 0.9252938
attr(,"credMass")
[1] 0.95
Run.100.2 <- runMCMC2Groups(100, .2, dataList)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 400
Unobserved stochastic nodes: 2
Total graph size: 813
Initializing model
lapply(Run.100.2, function(z) apply(z,2,mean))
[[1]]
theta[1] theta[2]
100.2627 100.6598
[[2]]
theta[1] theta[2]
100.2617 100.6603
[[3]]
theta[1] theta[2]
100.2594 100.6590
chainDiffs <- lapply(Run.100.2, function(z) z[,2]-z[,1])
lapply(chainDiffs, function(z) hdi(as.matrix(z)))
[[1]]
var1
lower -0.1525038
upper 0.9457228
attr(,"credMass")
[1] 0.95
[[2]]
var1
lower -0.1577104
upper 0.9359693
attr(,"credMass")
[1] 0.95
[[3]]
var1
lower -0.1635975
upper 0.9365602
attr(,"credMass")
[1] 0.95
Run.170.2 <- runMCMC2Groups(170, .2, dataList)
Compiling model graph
Resolving undeclared variables
Allocating nodes
Graph information:
Observed stochastic nodes: 400
Unobserved stochastic nodes: 2
Total graph size: 813
Initializing model
lapply(Run.170.2,function(z) apply(z,2,mean))
[[1]]
theta[1] theta[2]
169.3763 169.7759
[[2]]
theta[1] theta[2]
169.3778 169.7718
[[3]]
theta[1] theta[2]
169.3746 169.7765
chainDiffs <- lapply(Run.170.2, function(z) z[,2]-z[,1])
lapply(chainDiffs, function(z) hdi(as.matrix(z)))
[[1]]
var1
lower -0.1519031
upper 0.9611009
attr(,"credMass")
[1] 0.95
[[2]]
var1
lower -0.1523604
upper 0.9345152
attr(,"credMass")
[1] 0.95
[[3]]
var1
lower -0.1489817
upper 0.9523275
attr(,"credMass")
[1] 0.95
Explain the results of running the hierarchical model with different sets of hyperparameters:
- a more informative (smaller) hypeSD narrows the difference between the two group means;
- with a highly informative hypeSD, the posterior is driven mainly by the prior.
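A compact side-by-side comparison of the four runs makes these effects visible (a sketch; it reuses the Run objects created above):

# Posterior means of the group parameters and the 95% HDI of their difference
# for each prior setting
runs <- list("M=130, sd=200" = Run.130.200,
             "M=130, sd=0.2" = Run.130.2,
             "M=100, sd=0.2" = Run.100.2,
             "M=170, sd=0.2" = Run.170.2)
sapply(runs, function(r){
  pooled <- do.call(rbind, lapply(r, as.matrix))
  diffs  <- pooled[,2] - pooled[,1]
  c(theta1 = mean(pooled[,1]), theta2 = mean(pooled[,2]),
    diffLower = hdi(diffs)[["lower"]], diffUpper = hdi(diffs)[["upper"]])
})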
Compare results with fixed effects ANOVA model.
head(Y2)
y s
[1,] 128.41478 1
[2,] 99.92492 1
[3,] 149.86236 1
[4,] 178.42914 1
[5,] 96.67612 1
[6,] 115.41590 1
mANOVA <- lm(y~as.factor(s), as.data.frame(Y2))
summary(mANOVA)
Call:
lm(formula = y ~ as.factor(s), data = as.data.frame(Y2))
Residuals:
Min 1Q Median 3Q Max
-71.077 -18.636 0.522 16.592 78.619
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 120.558 1.782 67.65 <2e-16 ***
as.factor(s)2 31.515 2.520 12.51 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 25.2 on 398 degrees of freedom
Multiple R-squared: 0.2821, Adjusted R-squared: 0.2803
F-statistic: 156.4 on 1 and 398 DF, p-value: < 2.2e-16
anova(mANOVA)
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
as.factor(s) 1 99316 99316 156.37 < 2.2e-16 ***
Residuals 398 252793 635
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
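For a direct comparison with the Bayesian results (a sketch), the frequentist point estimate of the group difference and its confidence interval can be extracted from the fitted linear model:

# Fixed-effect estimate of mu[2] - mu[1] and its 95% confidence interval
coef(mANOVA)["as.factor(s)2"]
confint(mANOVA)["as.factor(s)2", ]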